Abstract

A meta-analysis was performed to investigate the impact of extensive reading (ER) on 
reading proficiency. This study gathered 71 unique samples from 49 primary studies 
published from 1980 to 2014 involving a total of 5,919 participants. Effect sizes were 
generated separately according to two different study designs: experimental-versus-control
contrasts and pre-to-post-test contrasts. A small to medium effect was found in both
study designs. Moderator analysis showed growing interest in ER in the field over the last
30 years. Also, a higher effect was found for adults than for children and
adolescents. English as a foreign language (EFL) settings showed a higher effect
than English as a second language (ESL) settings, and web-based stories had a higher
effect than paper books. Finally, ER as a part of the curriculum showed the highest mean
effect among ER types. Suggestions are made on how to implement ER in ESL and EFL
settings effectively.

Keywords: extensive reading, meta-analysis, reading comprehension, reading rate, vocabulary


Providing rich input in English is essential for promoting English proficiency. Extensive reading 
(ER) is an excellent way to provide target language input, especially in foreign language settings 
where the target language input is very limited. ER can be defined as “an approach to language 
teaching in which learners read a lot of easy material in the new language” (Bamford & Day, 
2004, p. 1). In other words, ER is a way of learning a language through a great amount of
reading for pleasure. It is sometimes called “pleasure reading,” “free voluntary reading,” or
“sustained silent reading.”

The contribution of ER to various aspects of language proficiency has been confirmed by a 
number of research studies: vocabulary (Horst, 2005; Zimmerman, 1997); reading 
comprehension (Elley & Mangubhai, 1983; Robb & Kano, 2013); reading rate (Beglar, Hunt & 
Kite, 2012; Bell, 2001); and writing (Im, Ahn & Yoon, 2010; Tsang, 1996). Moreover, its effect
on the affective domain such as motivation has also been demonstrated by a number of studies
(e.g., Al-Homoud & Schmitt, 2009; Kim & Hwang, 2006). Despite its attested effect on English 
proficiency, ER is not yet widely practiced in either English as a second language (ESL) or 
English as a foreign language (EFL) settings. In reality, ER is still “an approach less traveled” 
(Day & Bamford, 1998, p. 164). According to a survey by Kim (2002), 94% of high-school 
English teachers in Korea use the grammar-translation method rather than an ER approach for 
teaching English reading in the classroom. 

The aim of this study is to investigate how much effect we can expect from ER intervention 
through the tool of meta-analysis. In recent years, meta-analysis has gained attention in the field
of second language studies with an increasing amount of research in the field and the growing
need to integrate the previous findings. Also, the computation of effect sizes has been greatly
facilitated by the development of software programs such as Comprehensive Meta-Analysis and RevMan.

In order to investigate the overall effectiveness of ER and the variables that contribute to this
effect, the present study posed the following research questions:

1. What is the overall effectiveness of ER on reading proficiency (reading comprehension, 
reading rate, and vocabulary) in ESL and EFL settings? 

2. To what extent do identification (year of publication), context (age, ESL and EFL setting, 
library size), treatment (length, text type, ER form), and outcome (reading 
comprehension, reading rate, and vocabulary) variables affect the impact of ER? 1 


Literature Review 

There have been some efforts to integrate findings on ER. Krashen (2007) did a meta-analysis on 
the effect of ER on EFL adolescents and young adults. ER was found to have a large effect on 
reading comprehension (d = 0.88) and a medium effect on a cloze test (d = 0.73). Moreover, a 
positive relationship was found between the number of books read per student and reading 
comprehension but no relationship was found between the number of books per student and the 
results of the cloze test. 

Kim (2012) conducted a comprehensive meta-analysis on ER encompassing the cognitive
domain as well as the affective domain. The cognitive domain included listening, vocabulary,
literacy, reading speed, reading comprehension, and writing, while the affective domain included
interest, confidence, motivation, attendance, attitude, and anxiety. Computed with Hedges’ (1981)
formula, the mean effect size was 0.62 for the cognitive domain and 0.40 for the affective domain.

Nakanishi (2015) explored the overall strength of ER and how its effect differed depending on
the participants’ ages and periods of instruction. The overall effectiveness of ER was medium for
both study designs, group contrasts (d = 0.46) and pre-post contrasts (d = 0.71). Moreover, a
larger mean effect was found in the older participants (d = 0.67) and in the longer programs (d =
0.52).

1 We used the term “reading proficiency” to describe proficiency in reading comprehension, reading rate, and
vocabulary. The rationale for this is based on the wide acceptance of the positive relationship among reading
comprehension, reading rate and vocabulary in the field of language acquisition. In fact, a high level of reading
comprehension cannot take place without efficient decoding of print (Adams, 1994) and vocabulary knowledge.

There were a number of shortcomings in the previous research that motivated the present 
research. First, the samples were not large enough to claim the validity of the meta-analysis. 
Krashen’s (2007) meta-analysis included only nine studies for reading comprehension and 14
studies for the cloze test. Kim (2012) had a larger number of studies (i.e., 21 studies) compared
to Krashen (2007). This was due to the fact that practically any study on ER was included in his
investigation under the terms “cognitive domain” and “affective domain.” However, the effect of
ER can vary depending on the focus of second language (L2) ability (Yamashita, 2008). For
example, the effect of ER on reading comprehension can be different from its effect on writing.
Therefore, to obtain a better understanding of the effect of ER, it is better to do separate
meta-analyses on different aspects of L2 ability rather than combining them all in a single category
such as cognitive domain or affective domain. Nakanishi (2015) included the greatest number of
samples up to now: 22 samples for group contrasts and 21 samples for pre-post contrasts.

Although Nakanishi’s (2015) study included enough samples to support the validity of
the meta-analysis, its shortcoming was that a number of the included studies did
not properly reflect the characteristics of ER suggested by Day and Bamford (2002), particularly the
self-selection of books. That is, some of the primary studies included in his meta-analysis were
more like obligatory assigned reading of long texts (e.g., Kweon & Kim, 2008; Lao & Krashen, 
2000; Lin, 2010) rather than free selective pleasure reading. Also, some studies were not 
specifically targeted for English as a foreign or second language (e.g., Greenberg, Rodrigo, Berry, 
Brinck, & Joseph, 2006: English as a native language; Rodrigo, Krashen, & Gribbons, 2004: 
Spanish as a foreign language). Therefore, it is necessary to redo the analysis by narrowing the 
study samples according to the common conceptualization of ER. 

Furthermore, while previous meta-analyses have demonstrated the overall effectiveness of ER in 
comparison with traditional intensive approaches to reading, there are some unexplored issues 
related to the implementation of ER. For example, no study has explored how variables such as 
setting, library size, text type, and ER implementation affect the effectiveness of ER. Compared 
to previous meta-analyses, this study applied stricter inclusion criteria by only including primary 
studies which truly reflect ER features suggested by Day and Bamford (2002) in either ESL or 
EFL settings. In addition, by investigating some unexplored moderator variables, we tried to fill 
the gaps in the literature by providing answers to the practical issues of ER implementation. 


Method 

Search of literature 

In order to find and select the necessary data for the meta-analysis, exhaustive online and manual
bibliographical searches were conducted. Educational databases, including the Educational
Resources Information Center (ERIC), the Research Information Sharing Service (RISS), and
Dissertation and Thesis, were used for the online search. Search terms included
combinations of the following key words: ER, pleasure reading, reading comprehension, reading
rate, vocabulary, ESL, and EFL. In addition, seven applied linguistics journals were searched
manually: Reading Research Quarterly, Reading in a Foreign Language, TESOL Quarterly,
Language Learning, Applied Linguistics, ELT Journal, and System. Finally, the ER Foundation
bibliography, which lists about 500 references on ER, was examined.

Criteria for inclusion 

The inclusion criteria were constructed in terms of ER and meta-analysis. In order to be included 
in the meta-analysis, the individual studies had to embody the following five characteristics of 
ER from Day and Bamford (2002, pp. 137-140): 

i. The reading material is easy (suitable for learners’ levels). 

ii. Learners choose what they want to read. 

iii. Learners read as much as possible. 

iv. Reading is individual and silent. 

v. Teachers orient and guide their students. 

While investigating related studies, we noticed that there was a lack of a common understanding
of what ER is. For example, many studies which claimed to have adopted an ER approach turned
out to be, in fact, assigned readings of long texts. Such studies were excluded from our
meta-analysis since they did not properly reflect the self-selected reading principle of ER. This
principle is important since students were found to have higher satisfaction with books of their
choice over obligatory assigned reading (e.g., Hayashi, 1999; Park & Kang, 2004; Sheu, 2003).

In terms of meta-analysis, studies had to satisfy the following criteria:

1. Studies should adopt either an experimental or a quasi-experimental design that
yields quantifiable data for meta-analysis. That is, in order to calculate its effect, a study
should report statistical information such as means, standard deviations, number of
participants, t-value, p-value and so on. Primary studies that did not report
enough statistical information to compute an effect size had to be excluded; most often,
the missing information was the standard deviation
(e.g., Krashen, 1989). Results should indicate changes in learners’ reading
comprehension, reading rate, or vocabulary.

2. Studies from 1980 to 2014 were included because studies on ER in ESL/EFL
settings were extremely hard to find prior to 1980. Even where such studies existed, in most cases
they did not contain the information necessary to compute effect sizes, nor did they
adopt the principles of ER advanced by Day and Bamford (2002). Thus, only studies
from 1980 onwards were included.

Using the above-mentioned criteria, 51 samples from 32 primary studies were selected for the
experimental- vs. control-group design and 20 samples from 17 primary studies were selected for the
pre- vs. post-test design.

Coding of study reports

The details of each study were coded as shown in Table 1. First, studies were identified in terms
of author, title, year, and type of publication. Then they were coded with respect to context,
treatment, and outcome variables according to the research questions. Contextual information
included the setting of the studies, participants’ ages, and library size. Participants’ ages were
classified into three age groups: children, adolescents, and adults. The children’s group involved
participants in elementary school, the adolescents’ group those from middle school to high school, and
the adults’ group those at college level and above. Library size was classified into three categories
depending on how many books were made available to students. Studies which used web
stories or did not report the number of available books were excluded from this classification.
Libraries consisting of fewer than 100 books were coded as small, from 100 to 999 books as medium,
and over 1,000 books as large.


Table 1. Coded data from primary studies

Variables                                  Values
Identification
  Author
  Title
  Year of publication
  Type of publication                      Article, Report, Dissertation
Context
  Second or foreign language setting       SL, FL
  Country
  Participant’s age                        Children, Adolescents, Adults
  Library size                             Small, Medium, Large
Treatment
  Length                                   Short, Medium, Long
  Text type                                Paper, Web
  Implementation                           Type A, Type B, Type C, Type D
Outcome
  Dependent variable                       RC, RR, V
  Statistical information of control       N-size, Mean, Standard deviation, Effect sizes
  and experimental group

Notes. Type A: ER as an independent reading course, Type B: ER as a part of reading course,
Type C: ER as a part of curriculum, Type D: ER as an extracurricular activity. RC = reading
comprehension, RR = reading rate, V = vocabulary

Treatment information included the length of the program and the type of text used for reading. The
length of the reading programs was coded into three categories: short, medium, and long term.
Since most studies were carried out in semester units, investigations carried out for about a
semester or less (i.e., < 5 months) were classified as short term; investigations which lasted
up to an academic year (i.e., 6 to 10 months) were classified as medium term; and
investigations lasting over one academic year were classified as long term. The type of text was
classified into two categories, paper or web, depending on the format of the reading text.

How ER was integrated in the language curriculum was classified in accordance with Day and
Bamford (1998, p. 41):

• as a separate, stand-alone course 

• as part of an existing reading course 

• as a non-credit addition to an existing course 

• as an extracurricular activity 

Finally, outcome information included test scores for reading comprehension and vocabulary, 
while words read per minute was coded for reading rate. Specifically, all the necessary statistical 
information to compute effect sizes (e.g., number of participants, means, standard deviations) 
was coded. 
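
The categorical coding rules described above for library size and program length can be sketched as two small functions. This is only an illustration in Python, not the authors' coding procedure: the function names are hypothetical, and the handling of boundary cases the text leaves open (exactly 1,000 books; programs between 5 and 6 months) is an assumption marked in the comments.

```python
from typing import Optional


def code_library_size(n_books: Optional[int]) -> Optional[str]:
    """Map the number of available books to a library-size category.

    Returns None when the count is unreported, since such samples were
    left out of the library-size classification.
    """
    if n_books is None:
        return None
    if n_books < 100:
        return "small"
    if n_books < 1000:
        return "medium"
    # Assumption: exactly 1,000 books counts as large (the text says "over 1,000").
    return "large"


def code_program_length(months: float) -> str:
    """Map program duration in months to short, medium, or long term."""
    if months < 5:
        return "short"
    # Assumption: durations between 5 and 6 months are coded as medium.
    if months <= 10:
        return "medium"
    return "long"


print(code_library_size(250), code_program_length(4))
```
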

Meta-analysis 

We used the Comprehensive Meta-Analysis program to calculate the effect sizes. Effect sizes
can be calculated in several ways, such as Cohen’s d, Hedges’ g, Pearson’s r, and Glass’s delta.
Cohen’s d was adopted in this study because it is considered to be the most typical way to
estimate effect sizes (Cohen, 1977).
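
For readers unfamiliar with the statistic, the following sketch shows how Cohen's d is commonly computed from group means, standard deviations, and sample sizes, together with a widely used large-sample approximation to its standard error. It is not the internal routine of the Comprehensive Meta-Analysis program; the function names and the example scores are hypothetical.

```python
import math


def cohens_d(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (
        n_exp + n_ctl - 2
    )
    return (mean_exp - mean_ctl) / math.sqrt(pooled_var)


def d_standard_error(d, n_exp, n_ctl):
    """Large-sample approximation to the standard error of d."""
    total = n_exp + n_ctl
    return math.sqrt(total / (n_exp * n_ctl) + d ** 2 / (2 * total))


# Hypothetical post-test scores, not taken from any primary study.
d = cohens_d(mean_exp=60.0, sd_exp=10.0, n_exp=30,
             mean_ctl=55.0, sd_ctl=10.0, n_ctl=30)
se = d_standard_error(d, 30, 30)
print(round(d, 2), round(se, 2))
```
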

Cohen (1977, 1988) proposed a guideline for interpreting the effect size derived from
standardized mean differences based on his observations of the typical range of findings in social
science research: small effects (d < 0.2), medium effects (0.2 < d < 0.8), and large effects (0.8 <
d). However, effect sizes are best understood when they are interpreted within a specific domain.
Based on the observed effects from 346 primary studies and 91 meta-analyses, Plonsky and
Oswald (2014) suggested a new field-specific scale for L2 research. As shown in Table 2, their
benchmarks are set higher than Cohen’s.


Table 2. Benchmark for interpreting effect sizes in L2 research (Plonsky & Oswald, 2014)

Contrast (d)                  Small    Medium    Large
Between-group contrast        0.4      0.7       1.0
Pre- to post-test contrast    0.6      1.0       1.4

The effect sizes based on experimental- and control-group comparisons were meta-analyzed
separately from those based on pre- and post-test comparisons (Lipsey & Wilson, 2001).
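
Because the two designs are judged against different benchmarks, the same d can read differently depending on the contrast. The sketch below encodes the Table 2 thresholds; the band labels (for example "small to medium" for a value between the small and medium thresholds) and the function name are assumptions made for illustration, not part of Plonsky and Oswald's scale.

```python
# Thresholds (small, medium, large) from Table 2 (Plonsky & Oswald, 2014).
BENCHMARKS = {
    "between": (0.4, 0.7, 1.0),  # experimental- vs. control-group contrasts
    "within": (0.6, 1.0, 1.4),   # pre- to post-test contrasts
}


def interpret_d(d, design):
    """Return a verbal band for d under the benchmarks of the given design."""
    small, medium, large = BENCHMARKS[design]
    size = abs(d)
    if size < small:
        return "below small"
    if size < medium:
        return "small to medium"
    if size < large:
        return "medium to large"
    return "large"


# The same hypothetical value lands in different bands under the two designs.
print(interpret_d(0.5, "between"))
print(interpret_d(0.5, "within"))
```
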


Results 

Publication bias 

Funnel plots and the fail-safe N test were employed to check for publication bias. If the
distribution of the studies forms a symmetrical funnel shape, publication bias is unlikely.
However, deviation from this shape may indicate the existence of publication bias and of more
variability than expected from simple sampling error. The distribution of the observed studies is
more or less symmetrical as shown in Figures 1 and 2.


Reading in a Foreign Language 28(2) 



Jeon & Day: The effectiveness of ER on reading proficiency 





Figure 1. Funnel plot of experimental- vs. control-group comparison. 
Note. The diamond indicates the summary effect. 



Figure 2. Funnel plot of experimental- vs. control-group comparison. 
Note. The diamond indicates the summary effect. 


The results of the fail-safe N test also revealed that unpublished studies reporting non-significant
findings are unlikely to reverse the findings. For group comparisons, 8,530 missing studies would be
needed, and for pre-to-post-test comparisons 1,506 studies, to bring the p-value to the p > .05
level. Therefore, we can have confidence in the data because it is very unlikely that 8,530 and
1,506 studies with null results were not included in the meta-analysis because they failed to be
published or because they were simply overlooked by the researcher in the literature review.
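
The idea behind the fail-safe N can be sketched with Rosenthal's classic formulation, in which the combined Z of the observed studies is diluted by adding hypothetical null studies until it drops to the critical value. The source does not state which variant or alpha level the software applied, so the one-tailed critical value of 1.645 used as a default here is an assumption, and the Z scores in the example are invented.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal-style fail-safe N.

    Number of additional studies averaging Z = 0 needed to pull the
    combined (Stouffer) Z down to z_alpha. The default z_alpha is an
    assumed one-tailed .05 criterion.
    """
    k = len(z_scores)
    total_z = sum(z_scores)
    return max(0.0, (total_z / z_alpha) ** 2 - k)


# Hypothetical Z scores from four studies, for illustration only.
n_missing = fail_safe_n([2.0, 2.5, 3.0, 1.5])
print(round(n_missing, 1))
```
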


Overall effectiveness of ER 

The first research question concerns the overall effectiveness of an ER approach on reading
proficiency. The overall effectiveness of ER was synthesized separately based on the study
design (i.e., experimental- vs. control-group design or pre-to-post-test design) as shown in Tables
3 and 4. If groups in a study consisted of different members, they were considered unique
samples and an effect size was computed for each of them. In such cases, more than one sample
was drawn from a single study. For example, Lai’s (1993a) study provided seven samples from
seven independent schools (i.e., School A ~ School H).

The overall effectiveness aggregated from 51 experimental- vs. control-group contrasts was d = 0.57.
This indicates the superiority of the ER group over the intensive or traditional reading group on
the immediate post-test. The 95% confidence interval (0.46 to 0.68) lies well above zero,
within the medium effect range.
Moreover, when the effect size was viewed in terms of the average percentile standing, the mean
of the experimental group was at the 72nd percentile of the control group. This clearly shows the
impact of ER when compared to a traditional reading approach. Finally, the homogeneity test
was found to be statistically significant (Q = 316.80, df = 50, p < .01), which indicated that there
is more variability in effect sizes than would be expected from sampling error around the mean.
In other words, there is evidence for additional systematic sources of variability (i.e.,
moderators).
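
The figures in the preceding paragraph can be reproduced approximately with the sketch below, which uses only the Python standard library. The percentile standing assumes normally distributed scores (Cohen's U3). The I² index is not reported in the source; it is added here only as a common way to express how much of Q exceeds its degrees of freedom. Function names are hypothetical.

```python
from statistics import NormalDist


def percentile_standing(d):
    """Percentile of the control distribution at the experimental mean (Cohen's U3)."""
    return 100 * NormalDist().cdf(d)


def confidence_interval(d, se, level=0.95):
    """Normal-theory confidence interval for an effect size."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


def i_squared(q, df):
    """Percentage of total variability attributed to heterogeneity."""
    return max(0.0, (q - df) / q) * 100


# Values reported for the experimental- vs. control-group contrasts.
print(round(percentile_standing(0.57)))
print(confidence_interval(0.57, 0.06))
print(round(i_squared(316.80, 50), 1))
```

The interval computed from the rounded standard error (about 0.45 to 0.69) differs slightly from the reported 0.46 to 0.68, which is expected when inputs are rounded to two decimals.
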


Table 3. Treatment effects for experimental- vs. control-group comparisons

Study names                                    N (Exp)  d      SE    95% CI lower  95% CI upper  Outcome
Alavi & Kayvanshekoh (2012)                    18       0.63   0.17  0.30          0.96          V
Al-Homoud & Schmitt (2009)                     47       0.15   0.22  -0.29         0.59          RC, RR
Beglar et al. (2012)                           35       0.93   0.31  0.32          1.53          RC, RR
Bell (2001)                                    14       1.70   0.40  0.91          2.48          RC, RR
Burrows (2012)                                 74       0.75   0.17  0.42          1.08          RC
Cha (2009)                                     10       0.98   0.48  0.04          1.92          RR, V
Chen, Chen, Chen & Wei (2013)                  46       0.61   0.22  0.18          1.04          RC
Cho, Kim, & Krashen (2004)                     70       0.45   0.17  0.11          0.78          RC, V
Cho & Choi (2008)                              28       0.14   0.27  -0.39         0.67          RC
de Morgado (2009)                              30       0.32   0.26  -0.19         0.83          RC
Elley (1991) study3                            256      0.33   0.09  0.15          0.50          RC, V
Elley (1991) study3                            350      0.32   0.08  0.16          0.47          RC, V
Elley & Mangubhai (1983) class 5               70       1.08   0.17  0.75          1.41          RC
Elley & Mangubhai (1983) class 6               64       0.79   0.17  0.46          1.12          RC, V
Jeon (2008)                                    17       0.72   0.33  0.07          1.37          RC
Kim & Lim (2012)                               35       0.65   0.25  0.16          1.14          RC
Lai (1993a) School A                           86       0.47   0.15  0.18          0.76          RC, V
Lai (1993a) School B                           59       -0.32  0.19  -0.69         0.05          RC
Lai (1993a) School C                           83       0.32   0.16  0.01          0.63          RC, V
Lai (1993a) School D                           77       0.39   0.17  0.07          0.71          RC, V
Lai (1993a) School E                           40       -0.03  0.23  -0.48         0.42          RC, V
Lai (1993a) School G                           39       -0.27  0.24  -0.74         0.20          RC
Lai (1993a) School H                           36       0.02   0.23  -0.43         0.47          RC
Lee (2007) study 2                             67       0.10   0.15  -0.19         0.39          RC, V
Lee (2007) study 3                             41       0.54   0.18  0.19          0.89          RC, V
Lituanas, Jacobs, & Renandya (2001)            30       1.53   0.30  0.95          2.11          RC, RR
Mason & Krashen (1997) study 1                 20       -0.17  0.32  -0.80         0.46          RC
Mason & Krashen (1997) study 2 University      40       0.60   0.23  0.15          1.05          RC
Mason & Krashen (1997) study 2 Junior college  31       0.85   0.31  0.24          1.46          RC
Mason & Krashen (1997) study 3                 40       0.78   0.22  0.35          1.21          RC, RR
Matsui & Noro (2010)                           60       0.27   0.18  -0.09         0.62          RC, RR
Mermelstein (2014)                             41       1.04   0.23  0.59          1.49          V
Nakanishi & Ueda (2011)                        20       -0.41  0.31  -1.02         0.20          RC
Rezaee & Nourzadeh (2011)                      26       0.67   0.20  0.28          1.06          RC
Robb & Kano (2013) Economics major             543      1.00   0.06  0.88          1.12          RC
Robb & Kano (2013) Business major              587      0.92   0.06  0.80          1.04          RC
Robb & Kano (2013) Law major                   546      1.07   0.07  0.93          1.21          RC
Robb & Kano (2013) Foreign language major      254      0.96   0.09  0.78          1.14          RC
Robb & Kano (2013) Science major               100      0.96   0.15  0.67          1.25          RC
Robb & Kano (2013) Technology major            48       0.66   0.20  0.27          1.05          RC
Robb & Kano (2013) Computer major              142      1.34   0.13  1.09          1.59          RC
Robb & Susser (1989)                           63       0.62   0.19  0.26          0.98          RC, RR
Sheu (2003)                                    34       0.78   0.26  0.28          1.28          RC, RR
Shin (2013)                                    35       0.56   0.24  0.09          1.03          V
Sims (1996) Group A                            30       0.80   0.27  0.27          1.33          RC
Sims (1996) Group B                            30       0.64   0.26  0.13          1.15          RC
Smith (2006)                                   51       0.24   0.20  -0.15         0.63          RC
Tanaka & Stapleton (2007)                      96       0.40   0.15  0.11          0.69          RC, RR
Tsang (1996)                                   48       0.60   0.21  0.19          1.01          V
Weitz (2003)                                   43       0.20   0.22  -0.24         0.63          RC, V
Yamamoto (2011)                                33       0.48   0.14  0.21          0.75          V
Overall                                        4,683    0.57   0.06  0.46          0.68          RC, RR, V


Notes. RC= reading comprehension, RR= reading rate, V= vocabulary 
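
An overall row of this kind is a weighted aggregate of the per-sample effect sizes. The source does not state which pooling model produced it, so the sketch below shows one common choice, the DerSimonian-Laird random-effects estimator, purely as an assumption. The function name is hypothetical, and the example pools only the first three rows of Table 3, so its output is not expected to match the reported overall value.

```python
def random_effects_pool(effects, std_errors):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled effect, its standard error, tau squared, Q).
    """
    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect mean and the homogeneity statistic Q.
    fixed = sum(w * d for w, d in zip(weights, effects)) / sum_w
    q = sum(w * (d - fixed) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # Method-of-moments estimate of the between-study variance.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Re-weight with the between-study variance added to each sample.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * d for w, d in zip(re_weights, effects)) / sum(re_weights)
    pooled_se = (1.0 / sum(re_weights)) ** 0.5
    return pooled, pooled_se, tau2, q


# First three rows of Table 3 (d and SE), used only as sample input.
pooled, pooled_se, tau2, q = random_effects_pool([0.63, 0.15, 0.93],
                                                 [0.17, 0.22, 0.31])
print(round(pooled, 2), round(pooled_se, 2))
```
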


Similarly, the overall effect from 20 pre-test to post-test comparisons was small to medium (d =
0.79). Moreover, the numbers of participants involved in pre-test to post-test comparison studies
tended to be smaller than those of the experimental vs. control design, which led to a standard
error one and a half times as high (SE = 0.09) as that of the experimental vs. control design
(SE = 0.06). Furthermore, the
homogeneity test was also found to be statistically significant (Q = 68.88, df = 19, p < .01).


Table 4. Treatment effects for pretest to posttest comparisons

Study names                        N (Exp)  d     SE    95% CI lower  95% CI upper  Outcome
Fujita & Noro (2009)               68       0.32  0.19  -0.05         0.69          RC, RR
Hafiz & Tudor (1989)               16       1.04  0.22  0.61          1.47          RC
Hayashi (1999) beginning level     35       1.29  0.26  0.77          1.80          RC, V
Hayashi (1999) intermediate level  40       0.90  0.23  0.44          1.36          RC, V
Horst (2005)                       17       1.57  0.28  1.02          2.11          V
Huang & Liou (2007)                38       0.64  0.24  0.18          1.11          V
Iwahori (2008)                     33       0.64  0.25  0.15          1.14          RC, RR
Kim (2014)                         249      0.51  0.10  0.32          0.70          RC, RR
Kim & Hwang (2006)                 20       0.72  0.33  0.08          1.36          V
Lai (1993b) S1                     126      0.39  0.14  0.12          0.67          RC, RR
Lai (1993b) S2                     88       1.32  0.17  0.98          1.65          RC, RR
Lai (1993b) S3                     52       0.12  0.22  -0.31         0.56          RC, RR
Mason (2003)                       30       0.95  0.27  0.41          1.48          RC
Park & Kang (2004)                 35       0.80  0.25  0.31          1.29          RC, V
Taguchi et al. (2004)              10       1.44  0.51  0.45          2.44          RC, RR
Takase (2007)                      216      1.09  0.10  0.88          1.29          RC
Takase (2009) group 3              36       0.65  0.24  0.18          1.13          RC
Yamashita (2008)                   31       0.41  0.18  0.06          0.77          RC
Yang (2010)                        79       0.73  0.16  0.41          1.05          RR
Zimmerman (1997)                   17       0.89  0.36  0.19          1.60          V
Overall                            1,236    0.79  0.09  0.62          0.96          RC, RR, V


Notes. RC= reading comprehension, RR= reading rate, V= vocabulary. 


It can be concluded that the effectiveness of an ER approach was small to medium with respect to
Plonsky and Oswald’s (2014) proposed benchmark for interpreting effect sizes in SLA. That is,
experimental groups outperformed control groups on their immediate post-test and there was a
small to medium improvement in reading proficiency from the pre-test to post-test. The overall
supremacy of ER over an intensive or traditional reading approach is consistent with previous
meta-analyses.

The moderator analyses 

In order to assess differences among moderators, a mixed-effect model was employed.

Moderator analyses were performed in terms of identification, context, treatment, and outcome
variables as shown in Table 5. Q statistics were employed to find out which variable type was a
significant moderator. Moderator analysis was performed only for the experimental- vs. control-group
contrasts, since the pre-to-post-test contrasts had too few effect sizes for aggregation, which
could jeopardize the validity of the analysis.
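
A between-groups Q statistic of the kind used in such moderator analyses can be sketched as follows: each subgroup mean is weighted by the inverse of its squared standard error, and Q sums the weighted squared deviations from the weighted grand mean. The function name is hypothetical, and this is a simplified reading of the mixed-effect approach rather than the software's exact routine. The example uses the subgroup means and standard errors that Table 5 reports for the EFL and ESL settings.

```python
def q_between(subgroup_effects, subgroup_std_errors):
    """Between-groups heterogeneity statistic from subgroup means and SEs.

    Under the null hypothesis of equal subgroup effects, Q follows a
    chi-square distribution with (number of subgroups - 1) degrees of freedom.
    """
    weights = [1.0 / se ** 2 for se in subgroup_std_errors]
    grand_mean = sum(w * d for w, d in zip(weights, subgroup_effects)) / sum(weights)
    return sum(w * (d - grand_mean) ** 2 for w, d in zip(weights, subgroup_effects))


# EFL (d = 0.65, SE = 0.06) versus ESL (d = 0.38, SE = 0.09), as listed in Table 5.
q_setting = q_between([0.65, 0.38], [0.06, 0.09])
CHI2_CRITICAL_DF1 = 3.841  # chi-square critical value for df = 1 at alpha = .05
print(round(q_setting, 2), q_setting > CHI2_CRITICAL_DF1)
```

The value obtained from the rounded table entries (about 6.23) is close to, but not identical with, the reported Q = 5.89; the gap is plausibly due to rounding of the published means and standard errors.
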


Table 5. Moderator analysis for experimental- vs. control-group comparisons

Group           Moderator     Variable     d     k   SE    95% CI lower  95% CI upper  Q
Identification  Year          1980s        0.83  3   0.17  0.51          1.16          22.87*
                              1990s        0.33  16  0.08  0.18          0.48
                              2000s        0.49  14  0.09  0.31          0.66
                              2010s        0.78  18  0.07  0.65          0.92
Context         Setting       EFL          0.65  37  0.06  0.53          0.77          5.89*
                              ESL          0.38  14  0.09  0.20          0.56
                Age           Children     0.52  6   0.13  0.26          0.78          9.14*
                              Adolescents  0.35  15  0.09  0.17          0.53
                              Adults       0.70  30  0.06  0.56          0.81
                Library size  Small        0.64  4   0.19  0.27          1.01          2.02
                              Medium       0.50  17  0.09  0.33          0.67
                              Large        0.72  8   0.13  0.46          0.97
Treatment       Length        Short        0.51  16  0.01  0.31          0.71          0.49
                              Medium       0.59  31  0.01  0.46          0.73
                              Long         0.60  4   0.03  0.25          0.96
                Text type     Paper        0.47  41  0.05  0.37          0.57          18.48*
                              Web          0.89  10  0.09  0.73          1.06
                ER form       Type A       0.24  1   0.20  -0.15         0.63          30.90*
                              Type B       0.47  38  0.06  0.35          0.58
                              Type C       0.91  11  0.06  0.79          1.03
                              Type D       0.67  1   0.20  0.28          1.06
Outcome         Focus skills  RC           0.54  46  0.06  0.42          0.65          4.65
                              RR           0.83  10  0.14  0.55          1.11
                              V            0.47  17  0.10  0.28          0.66

Notes. Type A: ER as an independent reading course, Type B: ER as a part of reading course, Type C: 
ER as a part of curriculum, Type D: ER as an extracurricular activity. *p < 0.05 


(a) Identification. Fifty-one (51) samples were grouped into four different periods of
publication: 1980s, 1990s, 2000s, and 2010s. Although the 1980s had the highest mean
effect (d = 0.83), effect sizes increased from the 1990s onward (i.e., 1990s: d
= 0.33; 2000s: d = 0.49; 2010s: d = 0.78), and the difference among periods was
found to be statistically significant (Q = 22.87, p < 0.05). The increase not only in the mean
effect of ER but also in the number of studies reflected the development of and interest in
ER in the field of applied linguistics over the last 30 years.

(b) Setting. The difference between ESL (d = 0.38) and EFL (d = 0.65) settings was found
to be statistically significant (Q = 5.89, p < 0.05). Although an ESL setting is generally
considered to be a better environment to practice ER than an EFL setting, the present
result indicated the promising outlook of ER in EFL settings. However, this result
should be interpreted with caution since EFL settings (k = 37) involved more studies
than ESL settings (k = 14), which may have influenced the standard error of the effect
size. Moreover, ESL studies did not involve any adult groups, which showed a higher
mean effect than other age groups. It is also worth noting that the majority of EFL and
ESL countries included in the meta-analysis are Asian Pacific countries rather than
North American or European countries as shown in Figures 3 and 4.



Figure 3. Distribution of EFL settings. 



Figure 4. Distribution of ESL settings. 


(c) Age. A significant difference among different age groups was found (Q = 9.14, p <
0.05), which indicated that the age of the participants can influence the efficacy of the
ER approach. The highest mean effect size was found in the adult group (d = 0.70),
followed by the children (d = 0.52) and adolescent (d = 0.35) groups. The adult group, in
particular, showed twice as high an effect as the adolescent group. However,
studies were heavily concentrated on adult groups (k = 30), which could have
influenced the standard error of its effect. Currently, only six samples and 15 samples
were gathered for the children and adolescent groups, respectively. More research on these
groups is needed to confirm this finding.

(d) Library size. The test of homogeneity indicated no statistically significant difference
among library sizes (Q = 2.02). Therefore, library size was found to have no significant
influence on the impact of ER in this research.

(e) Length of treatment. No statistical difference was found among the lengths of programs.
In fact, they showed very similar aggregated effect sizes (short: d = 0.51, medium: d =
0.59, long: d = 0.60), which implies that ER can be effective regardless of the program
length. This was somewhat contradictory to the natural expectation that long
programs would have a greater impact than short programs. The reason for this
could be that the long programs had a much smaller sample size (k = 4)
than the other two program lengths (short: k = 16, medium: k = 31).

(f) Text type. A significant difference between text types was found (Q = 18.48, p < 0.05).
Surprisingly, web text (d = 0.89) showed almost double the effect of paper text (d =
0.47). This result is in accordance with Hung’s (2011) study, which revealed that the
ER of multimodal text was as effective as that of printed linear text in improving
Taiwanese undergraduate EFL learners’ English proficiency in simulated TOEIC tests.
However, this result should be interpreted with caution since the web text group had
fewer samples (k = 10) than the paper text group (k = 41). Also, the web text group was
mostly comprised of adult groups, which showed the highest aggregated effect size
compared to other age groups. Only one study (i.e., Cho & Kim, 2004) had a children
group.

(g) ER form. Type B (ER as a part of reading course) was found to be by far the most
common type of ER form in the field (k = 38). The Q-test indicated a significant
difference among variables (Q = 30.90, p < 0.05). ER as part of a curriculum showed
the highest mean effect (d = 0.91), followed by ER as an extracurricular activity (d =
0.67), ER as a part of reading course (d = 0.47), and ER as an independent reading
course (d = 0.24). However, the result should be interpreted with caution since ER
forms are heavily concentrated in either Type B (ER as a part of reading course) or
Type C (ER as a part of curriculum). Moreover, studies in Type A (ER as an
independent reading course) or Type D (ER as an extracurricular activity) were very
hard to find. Only one study was found for each type. Although Type A, which
employs ER as an independent reading course, seems to be the most ideal way of
implementing ER, it showed the smallest effect (d = 0.24). ER showed a higher effect
when it was accompanied by other classroom activities such as interactive vocabulary
instruction, group discussion, or oral presentation rather than devoting the entire class
to only reading.

(h) Focus skills. No statistical difference was found among outcome variables, even 
though reading rate showed a notably higher effect size (d = 0.83) than reading 
comprehension (d = 0.54) or vocabulary (d = 0.47). This may be because the number 
of reading rate samples (k = 10) was not large enough for the difference among 
variables to reach statistical significance. 
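
As an illustration of the homogeneity (Q) test referred to in the moderator analyses 
above, the following is a minimal sketch of a between-groups Q statistic under 
fixed-effect (inverse-variance) weighting. This is one common formulation and is not 
necessarily the exact procedure or software used in this study. The subgroup means are 
the text-type values reported above, but the variances are hypothetical placeholders, 
so the resulting Q is not expected to match the reported value. The function name 
q_between and the small critical-value table are likewise illustrative. 

```python
from math import fsum

# Chi-square critical values at alpha = 0.05, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def q_between(effects, variances):
    """Return (Q_between, df, weighted grand mean) for subgroup mean effects.

    effects   -- mean effect size (d) of each subgroup
    variances -- sampling variance of each subgroup mean
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two subgroups with matching variances")
    # Inverse-variance weights: more precise subgroups count for more.
    weights = [1.0 / v for v in variances]
    grand_mean = fsum(w * d for w, d in zip(weights, effects)) / fsum(weights)
    # Weighted squared deviations of subgroup means from the grand mean.
    q = fsum(w * (d - grand_mean) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1, grand_mean


# Reported subgroup means for text type (web, paper).
subgroup_d = [0.89, 0.47]
# HYPOTHETICAL variances, chosen only to make the sketch runnable.
subgroup_var = [0.010, 0.004]

q, df, grand = q_between(subgroup_d, subgroup_var)
# The difference is significant at the 0.05 level if Q exceeds the critical value.
significant = q > CHI2_CRIT_05[df]
print(f"Q = {q:.2f}, df = {df}, significant at 0.05: {significant}")
```

Because each weight is the inverse of a variance, a subgroup built from only a few 
samples has a larger variance and contributes less to Q. This is why the small k values 
noted above call for caution when interpreting the moderator results. 
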
Discussion 

The first research question concerned the overall effectiveness of ER on reading proficiency 
(reading comprehension, reading rate, and vocabulary). The results showed a small to medium 
effect for both the experimental-versus-control group design (d = 0.57) and the 
pre-to-post-test design (d = 0.79). The effect can also be compared with other meta-analytical 
findings in the field of second language studies. For example, the summary effect of corrective 
feedback was d = 1.16 (Russell & Spada, 2006), the effect of computer-assisted vocabulary 
instruction was d = 0.75 (Chiu, 2013), the effect of strategy instruction was d = 0.49 (Plonsky, 
2011), and the effect of visual input enhancement was d = 0.22 (Lee & Huang, 2008). 
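
To make the d values in this comparison concrete, the following minimal sketch computes 
a standardized mean difference for an experimental-versus-control contrast using the 
textbook pooled-standard-deviation formula. It is an assumption that this matches the 
estimator used in the primary studies or in this meta-analysis, which may apply 
additional corrections (for example, for small samples). The scores below are invented 
purely for illustration, and the function name cohens_d is hypothetical. 

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)  # sample SDs (n - 1 denominator)
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# INVENTED post-test scores for an ER group and a control group.
er_scores = [2, 4, 6]
control_scores = [1, 3, 5]

d = cohens_d(er_scores, control_scores)
print(f"d = {d:.2f}")
```

Because d expresses the group difference in standard deviation units, values from 
studies that used different tests and scales can be placed side by side, as in the 
comparison above. 
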

Currently, however, intensive reading is the dominant approach to teaching reading in 
the field, and the reported benefits of ER are of no use if ER is not practiced in school 
settings. One possible way to encourage the widespread use of ER is to inform and educate 
teachers, administrators, and policy makers about its effectiveness. Without convincing them 
of its advantages over traditional teaching, it would be very difficult to adopt an ER approach in 
school settings. 

First of all, policy makers and teachers should recognize that it takes time to see the benefits 
of ER (Lee, 2007). Therefore, once ER is implemented in the curriculum, it should not be 
removed quickly simply because its impact is not immediate or apparent. Also, teachers should 
stop viewing ER as additional work that merely takes up classroom time (Robb & Kano, 
2013). 

The second research question was whether the effects of ER differed significantly depending on 
the moderator variables. As for the identification variable, the number of studies on ER and the 
mean effect of ER increased gradually over the last 30 years. This shows not only growing interest 
in ER but also developing expertise in ER classroom implementation, which suggests a 
promising outlook for its future. 

As for the contextual variables, setting and age seemed to affect the outcome of ER. Although an 
ESL setting is thought to be a better environment for learning English, an EFL setting can also 
produce positive effects despite its limited input environment. Moreover, an ER approach 
showed the highest effect with adults and the lowest effect with adolescents. The high mean effect 
for adults was also reported in Nakanishi (2015), and the low mean effect for junior high and 
high school students was also found in Kim (2012). This may be because the adult 
group is better equipped cognitively than other age groups to start reading extensively. In other 
words, adults have a better capacity to take advantage of ER since they have more experience, 
background knowledge, and vocabulary. The lowest impact of ER on adolescents, on the other 
hand, could be due to the test-centered curriculum in the schools. Although 
adolescents are superior to children in cognitive ability, they may not have any 
interest in ER since it does not have a direct impact on their grades. 

One of the biggest hurdles in implementing ER is setting up a library, which can in fact require 
quite a large budget. In this study, library size was found to have no significant influence on 
the impact of ER. Hence, teachers can start ER on a small scale in their classrooms when the 
school cannot provide a library. Moreover, the high effect of web-based stories suggests the 
possibility of adopting a computer reader program instead of setting up an expensive library. 

Hung (2011), for example, reported a higher effect for multimodal text (i.e., a computer 
reader program) than for linear text (i.e., paper books) among EFL 
undergraduates. Although most of the studies included in the meta-analysis had a class or 
school library with a wide range of books from which students could freely borrow the books of 
their choice, many schools, in reality, may not have the funding necessary to set up a library. 

A computer reader program can be a great substitute for a library. Moodle Reader (Robb & Kano, 
2013), for example, not only provides students with a vast amount of reading material, but it can 
also track students’ reading progress and comprehension. It can be used outside of the classroom, 
so it does not intrude on classroom time. Moreover, it saves teachers a lot of work, such as 
going over students’ reports to confirm their understanding. Furthermore, students’ work on 
Moodle Reader is maintained in Excel documents, which can be incorporated into a student’s 
final grade. In short, a computer reader program is a cost-effective way to make ER more 
accessible. 

However, it should be borne in mind that the question of which text type (paper books or digital 
books) is better for promoting reading proficiency is still controversial in both English as a first 
language and English as a second or foreign language settings. For example, Empey (2013) 
reported negative evidence for digital books, claiming that it is difficult to keep focus on digital 
texts. Pino-Silva (2006) and Hung (2011), on the other hand, reported positive evidence, such as 
students enjoying access to hundreds of up-to-date magazines through the web. 

Moreover, some students, particularly young learners, may not feel comfortable using 
computers for reading or may not be ready to read independently outside of the classroom. In 
such cases, it may be better to set up a small classroom library. Teachers may prefer to stock the 
library with classical literature, but the readers’ interests come first because students will not 
read extensively or voluntarily unless they are interested in the material. Teachers can find out 
their students’ interests through questionnaires and purchase books or magazines accordingly 
(Day & Bamford, 1998). 

Finally, the issue of limited class time in ER can be resolved by implementing ER as a part of the 
school curriculum. In this way, students’ reading is not limited to classroom hours but is 
extended to reading outside the classroom. Whereas other ER forms engage students’ interest in 
reading only for a short term (i.e., only while research is conducted), incorporating ER into the 
curriculum can motivate students to read more over time. Moreover, implementing ER as a part 
of the school curriculum showed the highest mean effect among the four ER types. 


Conclusion 

This study examined the overall effectiveness of ER on reading proficiency in ESL and EFL 
settings. It also investigated how identification, context, treatment, and outcome variables 
influenced the outcome of ER programs. Our meta-analysis demonstrated the overall 
effectiveness of an ER approach compared to an intensive or traditional reading approach. Also, 
the effect of ER was highest with adult learners, computer reader programs, and ER as a part of 
the curriculum. Finally, the results showed a promising outlook for practicing ER in EFL settings. 

Due to financial and time constraints, however, ER is not yet widely practiced in the field, 
particularly in EFL settings. Nevertheless, this study suggests that ER implementation can be 
made easier when ER becomes a part of the school curriculum, and that the cost of setting up a 
library can be significantly reduced by replacing paper books with digital books. Nowadays, 
technologies are evolving rapidly and the literacy of multimodal text is indispensable in this era 
of Information Computer Technology (Kress, 2003). 

Although an individual teacher can practice an ER program in his or her own classroom, ER 
programs can be greatly facilitated by systematic support from schools or the government. 
For example, schools or the government can provide a variety of books through the school 
library or computer programs, which an individual teacher cannot afford on his or her own. 

We hope this study provides a helpful and insightful summary of empirical findings on an ER 
approach in EFL and ESL settings. A limitation of this study is that most of the studies included 
in the analysis were from an Asian Pacific context, which may not reflect a balanced view of ER 
practice around the world. Also, long-term studies lasting more than a year and studies with 
child participants were very hard to find. Consequently, the number of samples for long-term 
studies and for studies with children was small and may not have been large enough to reflect 
the true effect of ER in the long run and on children. More research on these topics is needed. 